Transient and asymptotic dynamics of reinforcement learning in games

Authors

Luis R. Izquierdo

Segismundo S. Izquierdo

Nicholas Mark Gotts

J. Gareth Polhill

Abstract

Reinforcement learners tend to repeat actions that led to satisfactory outcomes in the past, and avoid choices that resulted in unsatisfactory experiences. This behavior is one of the most widespread adaptation mechanisms in nature. In this paper we fully characterize the dynamics of one of the best known stochastic models of reinforcement learning [Bush, R., Mosteller, F., 1955. Stochastic Models of Learning. Wiley & Sons, New York] for 2-player 2-strategy games. We also provide some extensions for more general games and for a wider class of learning algorithms. Specifically, it is shown that the transient dynamics of Bush and Mosteller’s model can be substantially different from its asymptotic behavior. It is also demonstrated that in general—and in sharp contrast to other reinforcement learning models in the literature—the asymptotic dynamics of Bush and Mosteller’s model cannot be approximated using the continuous time limit version of its expected motion. © 2007 Elsevier Inc. All rights reserved. JEL classification: C73
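The abstract does not state the update rule, so the following is only a minimal sketch of a Bush-Mosteller style learner in a 2-player 2-strategy game, under assumptions that are not taken from the text above: the payoff matrix (an illustrative Prisoner's Dilemma), the aspiration level, the learning rate, and the names `stimulus`, `bm_update`, and `simulate` are all hypothetical. Each player computes a stimulus in [-1, 1] from the gap between the received payoff and an aspiration level, then moves the probability of the action just taken toward 1 (satisfactory outcome) or toward 0 (unsatisfactory outcome).

```python
import random

# Illustrative Prisoner's Dilemma payoffs (assumed values, not from the abstract).
# PAYOFFS[(a1, a2)] = (payoff to player 1, payoff to player 2)
PAYOFFS = {
    ("C", "C"): (3.0, 3.0),
    ("C", "D"): (0.0, 4.0),
    ("D", "C"): (4.0, 0.0),
    ("D", "D"): (1.0, 1.0),
}


def stimulus(payoff, aspiration, max_gap):
    """Scale the payoff-minus-aspiration gap into [-1, 1]."""
    return (payoff - aspiration) / max_gap


def bm_update(p, s, learning_rate):
    """Return the new probability of the action that was just played."""
    if s >= 0:
        # Satisfactory outcome: move p toward 1.
        return p + learning_rate * s * (1.0 - p)
    # Unsatisfactory outcome: move p toward 0.
    return p + learning_rate * s * p


def simulate(p_coop, aspiration=2.0, learning_rate=0.5, steps=100, seed=0):
    """Simulate two learners; return the history of (p1, p2) cooperation probabilities."""
    rng = random.Random(seed)
    probs = list(p_coop)
    # Largest possible |payoff - aspiration| for each player, used for scaling.
    max_gap = [max(abs(u[i] - aspiration) for u in PAYOFFS.values()) for i in range(2)]
    history = [tuple(probs)]
    for _ in range(steps):
        actions = tuple("C" if rng.random() < probs[i] else "D" for i in range(2))
        payoffs = PAYOFFS[actions]
        for i in range(2):
            s = stimulus(payoffs[i], aspiration, max_gap[i])
            if actions[i] == "C":
                probs[i] = bm_update(probs[i], s, learning_rate)
            else:
                # The action taken was D, so update P(D) and convert back to P(C).
                probs[i] = 1.0 - bm_update(1.0 - probs[i], s, learning_rate)
        history.append(tuple(probs))
    return history
```

Calling `simulate((0.5, 0.5), steps=1000)` and inspecting early versus late entries of the returned history is one way to explore, informally, the gap between transient and asymptotic behavior that the abstract refers to; a single simulated run is an illustration, not a substitute for the paper's analysis.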

Related articles

A Behavioral Learning Process in Games

The paper studies the cumulative proportional reinforcement (CPR) rule, according to which an agent plays, at each period, an action with a probability proportional to the cumulative utility that the agent has obtained with that action. The asymptotic properties of this learning process are examined for a decision-maker under risk, where it converges almost surely toward the expected utility ma...

Stochastic Stability of Perturbed Learning Automata in Positive-Utility Games

This paper considers a class of discrete-time reinforcement-learning dynamics and provides a stochastic-stability analysis in repeatedly played positive-utility (strategic-form) games. For this class of dynamics, convergence to pure Nash equilibria has been demonstrated only for the fine class of potential games. Prior work primarily provides convergence properties through stochastic approximatio...

Convergent Multiple-times-scales Reinforcement Learning Algorithms in Normal Form Games

We consider reinforcement learning algorithms in normal form games. Using two-time-scales stochastic approximation, we introduce a model-free algorithm which is asymptotically equivalent to the smooth fictitious play algorithm, in that both result in asymptotic pseudotrajectories to the flow defined by the smooth best response dynamics. Both of these algorithms are shown to converge almost surel...

Convergent Multiple-timescales Reinforcement Learning Algorithms in Normal Form Games

We consider reinforcement learning algorithms in normal form games. Using two-timescales stochastic approximation, we introduce a model-free algorithm which is asymptotically equivalent to the smooth fictitious play algorithm, in that both result in asymptotic pseudotrajectories to the flow defined by the smooth best response dynamics. Both of these algorithms are shown to converge almost surely...

Reinforcement learning based feedback control of tumor growth by limiting maximum chemo-drug dose using fuzzy logic

In this paper, a model-free reinforcement learning-based controller is designed to extract a treatment protocol because the design of a model-based controller is complex due to the highly nonlinear dynamics of cancer. The Q-learning algorithm is used to develop an optimal controller for cancer chemotherapy drug dosing. In the Q-learning algorithm, each entry of the Q-table is updated using data...

Journal:

Games and Economic Behavior

Volume 61

Publication date: 2007
